Testing the distance duality relation with present and future data

The assumptions that light propagates along null geodesics of the spacetime metric and that the
number of photons is conserved along the light path lead to the distance duality relation (DDR),
$\eta = D_L(z)(1+z)^{-2}/D_A(z) = 1$, with $D_L(z)$ and $D_A(z)$ the luminosity and angular diameter
distances to a source at redshift $z$. In order to test the DDR, we follow the usual strategy of comparing
the angular diameter distances of a set of clusters, inferred from X-ray and radio data, with the
luminosity distance at the same cluster redshift, using the local regression technique to estimate
$D_L(z)$ from the Type Ia Supernovae (SNeIa) Hubble diagram. In order to both strengthen the constraints
on the DDR and get rid of the systematics related to the unknown cluster geometry, we also
investigate the possibility of using Baryon Acoustic Oscillations (BAO) to infer $D_A(z)$ from future
BAO surveys. As a test case, we consider the proposed Euclid mission, investigating the precision
that can be afforded on $\eta(z)$ from the expected SNeIa and BAO data. We find that the combination of
BAO and local regression allows one to reduce the errors on $\eta_a = d\eta/dz|_{z=0}$ by a factor of two
if $\eta_0 = \eta(z=0) = 1$ is forced and future data are used. On the other hand, although the statistical
error on $\eta_0$ is not significantly reduced, the constraints on this quantity will nevertheless be
improved thanks to the reduced impact of systematics.

PACS numbers: 98.80.-k, 98.80.Es, 97.60.Bw



I. INTRODUCTION 

The Etherington reciprocity theorem [1] states that, if source and observer are in relative
motion, solid angles subtended between the observer and the source are related by geometrical
invariants where the redshift of the source as measured by the observer enters in the relation.
First proven in the context of relativistic geometrical optics, it only relies on the two
assumptions that light travels along null geodesics in a Riemannian spacetime and that the number
of photons is conserved [?]. Although often underrated, the Etherington reciprocity theorem
actually plays a fundamental role in observational cosmology, with applications ranging from
gravitational lensing [?] to the CMBR temperature shift equation $T_e = T_0/(1+z)$ [?] and the
well known result that the surface brightness of a source does not depend on its distance to the
observer. Among its different incarnations, a widely used formulation of the Etherington
reciprocity theorem is represented by the so-called distance duality
relation (hereafter, DDR), reading:

$$\eta(z) = \frac{D_L(z)(1+z)^{-2}}{D_A(z)} = 1 , \qquad (1)$$

where $D_L(z)$ and $D_A(z)$ are the luminosity distance (LD) and angular diameter distance (ADD).
Having been derived from the reciprocity law, the DDR holds in any cosmology provided the
spacetime is Riemannian and there are no sources of attenuation (like gray dust) or brightening
(such as gravitational lensing). As such, one can take it for granted, but a more interesting
possibility is to test it against astronomical observations. To this end, one should be able to
measure, for a given $z$, both the LD and the ADD by means of a standard candle and a standard
ruler, respectively. From this point of view, Type Ia Supernovae (although standardizable rather
than standard candles) are the ideal tool to estimate the LD, as is indeed routinely done when
using their Hubble diagram to constrain cosmological parameters. On the contrary, ADDs are much
more difficult to measure, but some significant steps forward have recently been made based on the
Sunyaev-Zel'dovich effect in galaxy clusters [?]. Unfortunately, while the method to estimate the
ADD from the measured temperature decrement is theoretically and observationally well understood,
the impact of systematics related to the cluster geometry and the plasma physics is still quite
strong, leading to contrasting conclusions on the DDR validity at any redshift [?].

As a further issue, one also has to take care of the errors due to the mismatch between the
cluster redshift and that of the closest SN in the companion SNeIa sample adopted. Different
strategies have been implemented to avoid this problem (e.g., by rejecting the clusters for which
no SN at the same $z$ is available) or to reduce its impact by relying on the LD value inferred
from SNeIa with $|\Delta z| < 0.005$, where $\Delta z = z_{SN} - z_{cl}$. As a possible way out of
this issue, we present here a novel method relying on the local regression technique [14] to get
a reliable LD estimate at exactly the same redshift as the cluster one.

An alternative standard ruler is represented by the sound horizon $r_s$, i.e. the comoving
distance a sound wave could have traveled in the photon-baryon fluid by the time of decoupling.
The importance of such a scale may be guessed by noting that, at the time of recombination,
baryon waves stop propagating freely in the initial baryon-photon plasma, thus leaving a density
excess at the sound horizon scale. Should galaxies form at the centre of density perturbations,
we should observe a peak in the galaxy correlation function at this scale. Since the Fourier
transform of such a peak appears as an oscillating feature, the matter power spectrum should
present oscillations at the corresponding wavenumber. Such oscillations have indeed been detected
[?] and are now referred to as Baryon Acoustic Oscillations (BAO, see [?] for a nice review).
Should one be able to measure the power spectrum as a function of both the parallel and
transverse wavenumber at different redshifts $z$, BAO would allow one to determine the values of
$r_s H(z)$ and $D_A(z)/r_s$, where $H(z)$ is the Hubble expansion rate. Although BAO data
actually determine ADDs only up to the unknown sound horizon $r_s$, it is worth noting that this
latter quantity is well constrained by present day CMBR data, with a precision which will likely
increase as the Planck mission data [12] become available. Moreover, the ADDs inferred from BAO
and the CMBR determination of $r_s$ will be free of the unknown systematics related to the
cluster geometry and physics. We will therefore investigate here whether future BAO and SNeIa
surveys can be combined to strengthen the constraints on $\eta(z)$ and detect any DDR violation.

The plan of the paper is as follows. The local regression technique is presented in Section II
and then used to infer the $\eta(z)$ values from the present day SNeIa and cluster data. Section
III investigates the constraints these data put on two different parameterizations of $\eta(z)$,
highlighting to what extent they depend on the cluster geometry assumptions. The use of BAO as
alternative standard rulers is presented in Section IV, where we also investigate the constraints
this method can impose on $\eta(z)$ relying on future SNeIa and BAO data which will be collected
by the Euclid satellite. We then summarize and conclude in Section V.



II. AVOIDING REDSHIFT MISMATCH

Since the DDR involves the ratio between the values of the LD scaled by $(1+z)^{-2}$ and the ADD
at the same redshift $z$, the first issue one has to tackle is the difficulty of exactly matching
the measurements of these two quantities. As an example, let us consider the ADD catalog
assembled by Bonamente et al. [?] (hereafter, B06) from 38 galaxy clusters spanning the redshift
range (0.149, 0.890). To trace the LD, we will use the most updated SNeIa sample, namely the
Union2 sample with 557 SNeIa over the range (0.29, 1.40). Should we decide to only use the LD and
ADD measurements with exactly the same $z$ in the two catalogs, we would obtain a sample of only
13 objects with large error bars, so that the results on testing the DDR would likely be quite
poor.

In an attempt to strengthen the constraints, one can therefore adopt an approximate matching by
selecting only those clusters which have at least one SN with $|\Delta z| < 0.001$ (0.005); one
then finds 32 (38) objects and estimates the LD at $z_{cl}$ from the sample of LD measurements
approximately matched for each $z_{cl}$. Two strategies are possible to this end: one can simply
take a weighted mean (with the inverse squared error as weights) or linearly interpolate the
data. As we will show later, the choice of the LD estimate method and the value of
$\Delta_{max}$ (the maximum $|\Delta z|$ allowed) have a non negligible impact on the constraints
on the DDR parameters. In order to reduce this bias, one should make $\Delta_{max}$ as small as
possible, but this comes at the price of weakening the constraints, so that finding the right
compromise is a hard issue.

As a possible way out of this problem, we resort here to the local regression (LR) technique [14]
to infer the distance modulus $\mu$ at the cluster redshift $z_{cl}$ from the companion SNeIa
sample. The basic idea underlying LR relies on fitting simple models to localized subsets of the
data to build up a function that describes the deterministic part of the variation in the data,
point by point. One is not required to specify a global function of any form to fit a model to
the data, so that there is no ambiguity in the choice of the interpolating function. Indeed, at
each point, a low degree polynomial is fit to a subset of the data containing only those points
which are nearest to the one whose response is being estimated. The polynomial is fit using
weighted least squares with a weight function which quickly decreases with the distance from the
point where the model has to be recovered. We use the Union2 SNeIa sample as input to the local
regression estimate of $\mu(z)$ following the steps schematically sketched below.

1. Order the SNeIa according to increasing value of $|z_{cl} - z_i|$ and select the first
   $n = \alpha N_{SNeIa}$, with $\alpha$ a user selected value and $N_{SNeIa}$ the total number
   of SNeIa.

2. Define the weight function $W(u)$, where $u = |z_{cl} - z_i|/\Delta$ and $\Delta$ is the
   maximum value of $|z_{cl} - z_i|$ over the subset chosen before.

3. Fit a first order polynomial to the data selected at step 1, weighting each SNeIa with the
   corresponding value of the weight function $W(u)$, and take the zeroth order term as best
   estimate of $\mu(z)$.

4. Estimate the error on $\mu(z)$ as the root mean square of the weighted residuals with respect
   to the best fit zeroth order term.

It is worth stressing that both the choice of the weight function and the order of the fitting
polynomial are somewhat arbitrary. Similarly, the value of $\alpha$ to be used must not be too
small, in order to make up a statistically meaningful sample, but also not too large, since this
would prevent the use of a low order polynomial. In [14] (to which we refer the reader for any
detail), an extensive set of simulations was performed to both check the reliability of the LR
method and look for a suitable value of $\alpha$. It was shown there that setting
$\alpha = 0.025$ allows one to recover the input distance modulus typically within 0.35% (and
with deviations never larger than 1%), independently of the redshift $z$ and the cosmological
model adopted (at least within the large class of dark energy equations of state considered). We
will therefore adopt the above procedure to estimate the distance modulus and then the LD,
$D_L(z) = {\rm dex}[(\mu - 25)/5]$ (with ${\rm dex}(x) = 10^x$), for all the clusters in the ADD
catalogs we will use later.
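
The four steps of the local regression recipe, followed by the conversion from distance modulus to LD, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used for the analysis: the function names are hypothetical, the tricube kernel is assumed for $W(u)$ (it is the usual choice in local regression, but the weight function is not spelled out in the text here), the straight line is fit in $z - z_{cl}$ so that its zeroth order term is the value at the cluster redshift, the error is one possible reading of step 4, and $D_L$ is taken to be in Mpc.

```python
import numpy as np


def tricube(u):
    """Tricube kernel (assumed form of W(u)); it vanishes for |u| >= 1."""
    u = np.abs(u)
    return np.where(u < 1.0, (1.0 - u**3) ** 3, 0.0)


def local_regression_mu(z_cl, z_sn, mu_sn, alpha=0.025):
    """Return (mu, sigma_mu) at redshift z_cl from a SN sample (z_sn, mu_sn).

    Assumes the selected SNe do not all share the same redshift.
    """
    z_sn = np.asarray(z_sn, dtype=float)
    mu_sn = np.asarray(mu_sn, dtype=float)

    # Step 1: select the n = alpha * N supernovae closest in redshift to z_cl.
    n = max(3, int(round(alpha * z_sn.size)))
    dist = np.abs(z_sn - z_cl)
    idx = np.argsort(dist)[:n]

    # Step 2: weights W(u) with u = |z_cl - z_i| / Delta.
    delta = dist[idx].max()
    w = tricube(dist[idx] / delta)

    # Step 3: weighted least squares fit of mu = a + b * (z - z_cl).
    x = z_sn[idx] - z_cl
    design = np.column_stack([np.ones_like(x), x])
    lhs = (design.T * w) @ design
    rhs = (design.T * w) @ mu_sn[idx]
    a, b = np.linalg.solve(lhs, rhs)

    # Step 4: weighted rms of the residuals about the zeroth order term a.
    resid = mu_sn[idx] - a
    sigma = np.sqrt(np.sum(w * resid**2) / np.sum(w))
    return a, sigma


def luminosity_distance(mu, sigma_mu=0.0):
    """Convert a distance modulus into D_L (Mpc assumed) with its error."""
    d_l = 10.0 ** ((mu - 25.0) / 5.0)
    # First order propagation: dD/D = (ln 10 / 5) * dmu.
    return d_l, np.log(10.0) / 5.0 * d_l * sigma_mu
```

With $\alpha = 0.025$ and a sample of 557 SNeIa, step 1 keeps the 14 objects closest in redshift to each cluster.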



III. DDR VS PRESENT DAY DATA 

Testing the validity of the DDR is the same as checking that the parameter $\eta(z)$ defined in
Eq.(1) is strictly constant and equal to unity at all $z$. To this end, it is convenient to
phenomenologically parameterize this quantity so that deviations from the validity of the DDR can
be expressed in a quantitative way. Inspired by the analogy with the dark energy equation of
state, two common expressions adopted in the literature read [?]:

$$\eta(z) = \begin{cases} \eta_0 + \eta_a z/(1+z) \\ \eta_0 + \eta_a \ln(1+z) \end{cases} \qquad (3)$$

so that the DDR is never violated if $(\eta_0, \eta_a) = (1, 0)$. It is worth noting that, while
the first formula predicts that $\eta(z)$ asymptotically approaches the constant value
$\eta_0 + \eta_a$ at high $z$ (so that one can formally have a violation of the DDR at low
redshift but recover it for $z \to \infty$ if $\eta_0 + \eta_a = 1$), the second expression
formally diverges at infinity, so that it must be considered as a low $z$ approximation only. We
nevertheless include it both to compare our results with previous ones and to allow for a quickly
varying $\eta(z)$ (noting that, for the same $\eta_a$, the logarithmic ansatz increases faster
than the first expression).
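
As a quick numerical illustration of the two parameterizations in Eq.(3), the following minimal sketch evaluates both on a few redshifts. The function names are hypothetical and the parameter values are arbitrary; the only point is to make the limiting behaviours quoted above easy to check, namely that the first expression saturates at $\eta_0 + \eta_a$ while the logarithmic one keeps growing and, for the same $\eta_a$, departs faster from $\eta_0$.

```python
import numpy as np


def eta_first(z, eta0=1.0, eta_a=0.0):
    """First ansatz of Eq. (3): eta0 + eta_a * z / (1 + z)."""
    z = np.asarray(z, dtype=float)
    return eta0 + eta_a * z / (1.0 + z)


def eta_log(z, eta0=1.0, eta_a=0.0):
    """Second ansatz of Eq. (3): eta0 + eta_a * ln(1 + z)."""
    z = np.asarray(z, dtype=float)
    return eta0 + eta_a * np.log1p(z)


if __name__ == "__main__":
    z = np.array([0.1, 0.5, 1.0, 2.0])
    # Arbitrary illustrative values: eta0 = 1, eta_a = 0.2.
    print("first ansatz:", eta_first(z, 1.0, 0.2))
    print("log ansatz:  ", eta_log(z, 1.0, 0.2))
```

Both functions have slope $\eta_a$ at $z = 0$, so they are indistinguishable at very low redshift and differ only in how they extrapolate.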

As a second remark, it is worth spending some words on the value of $\eta_0$. If one assumes
that the Robertson-Walker metric holds (i.e., the universe is homogeneous and isotropic on large
scales), one gets $D_L(z=0) = D_A(z=0)$ and hence $\eta_0 = 1$ independently of whether the DDR
holds or not. However, such a result breaks down if photons are absorbed or emitted along their
light path or, in other words, if the effective opacity [16] of the universe is not zero. In such
a case, one can still have a homogeneous and isotropic universe and nevertheless a value of
$\eta_0 \neq 1$, so that we will explore both one parameter models forcing $\eta_0 = 1$ and two
parameter cases constraining its value from the fit to the data.

The two expressions in Eq.(3) provide a purely phenomenological approach to testing the DDR. As
a different method, it is also possible to assume a model for the absorption and/or production of
photons due to interactions with, e.g., axion-like particles, or to work out a different
mechanism leading to a non vanishing and redshift dependent effective opacity (see, e.g., [16]
and refs. therein for some interesting examples). The price to pay is, however, the introduction
of a dependence of the fitting results on both the underlying cosmological model and the
parameters of the opacity production phenomenon. Since the number and quality of the present day
data are far from being good enough, we have preferred here to adopt a model independent approach
relying on the above two phenomenological expressions.

As input dataset, we follow the common approach of using the Union2 SNeIa sample as LD tracer
and two different galaxy cluster samples with X-ray and SZ data to measure the ADDs. The first
one is the catalog of 25 clusters assembled by De Filippis et al. ([?], hereafter DeF05), while
the second one is made of 38 clusters and will be referred to here as the B06 [?] sample. It is
worth stressing that, although the data and the method used to determine the ADD of each cluster
are the same, the two samples differ in a critical assumption. Indeed, while B06 assumes a
spherical geometry, DeF05 explicitly correct their estimates taking care of their constraints on
the ellipsoidal cluster geometry. As amply discussed in the literature [?], the assumption of a
spherical or ellipsoidal geometry has a great impact on the ADD determination, so that the
estimated $\eta$ values are not consistent with each other. As a consequence, the constraints on
$(\eta_0, \eta_a)$ will also depend on which sample is used and cannot be straightforwardly
compared.

In order to constrain the parameters, we resort to the usual $\chi^2$ analysis, i.e., we minimize
the merit function:

$$\chi^2 = \sum_i \left[ \frac{\eta_{obs}(z_i)(1 + \Delta_0) - \eta_{th}(z_i, \mathbf{p})}{\sigma_i} \right]^2 \qquad (4)$$

with $\eta_{obs}(z_i)$ and $\eta_{th}(z_i, \mathbf{p})$ the observed and theoretically predicted
$\eta(z)$ values at redshift $z_i$, $\sigma_i$ the measurement uncertainty, and
$\mathbf{p} = \eta_a$ or $\mathbf{p} = (\eta_0, \eta_a)$ for the one and two parameter models,
respectively. Eq.(4) contains an additional term $(1 + \Delta_0)$ which we have introduced to
take into account a systematic uncertainty on the LD as inferred from the SN distance modulus.
Indeed, since the absolute SN magnitude is known up to $\pm 0.05$ mag, the LD can be shifted by a
factor $\Delta_0 \simeq \pm 2.3\%$. We therefore add this as a nuisance parameter and marginalize
over it with a Gaussian prior centred on $\langle \Delta_0 \rangle = 0$ and with standard
deviation $\sigma_0 = 0.023$. As far as we know, this is the first time such a term is taken into
account$^1$, while neglecting it can artificially reduce the uncertainties on the inferred
constraints on the model parameters $\mathbf{p}$.
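
The 2.3% figure quoted for the LD shift follows from propagating the magnitude uncertainty through $D_L = 10^{(\mu - 25)/5}$. As a quick check (a worked step added for illustration, using first order error propagation):

```latex
\frac{\delta D_L}{D_L} = \frac{\ln 10}{5}\,\delta\mu
\simeq 0.4605 \times 0.05 \simeq 0.023 .
```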

The best fit parameters will be obtained by minimizing the $\chi^2$ merit function, while the
68% (95%) confidence limits will be found by imposing $\Delta\chi^2 = 1.0$
($\Delta\chi^2 = 4.0$). To this end, we first integrate the likelihood
$\mathcal{L}(\eta_0, \eta_a, \Delta_0) \propto \exp[-\chi^2(\eta_0, \eta_a, \Delta_0)/2] \exp[-\Delta_0^2/(2\sigma_0^2)]$
over all the parameters but the one of interest. We then define $\chi^2 = -2 \ln \mathcal{L}_i$
(with $\mathcal{L}_i$ the marginalized likelihood for the $i$-th parameter) and find the 68% and
95% CL by solving the above relations for $\Delta\chi^2$.
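
The fitting procedure just described can be sketched on a grid for the one parameter case ($\eta_0 = 1$). This is a minimal illustration, not the code used for the analysis: the function names, the grid ranges and the grid resolution are assumptions, the first parameterization of $\eta(z)$ is used, and the integral over $\Delta_0$ is replaced by a sum over a uniform grid truncated at $\pm 5\sigma_0$.

```python
import numpy as np


def eta_model(z, eta0, eta_a):
    """First parameterization: eta0 + eta_a * z / (1 + z)."""
    return eta0 + eta_a * z / (1.0 + z)


def constrain_eta_a(z, eta_obs, sigma, sigma0=0.023):
    """Best fit and 68% range of eta_a (eta0 = 1), marginalizing over Delta0."""
    z, eta_obs, sigma = (np.asarray(a, dtype=float) for a in (z, eta_obs, sigma))
    eta_a_grid = np.linspace(-2.0, 2.0, 401)                  # illustrative grid
    delta_grid = np.linspace(-5.0 * sigma0, 5.0 * sigma0, 101)

    # chi2 on the (eta_a, Delta0) grid, following the merit function above.
    model = eta_model(z[None, :], 1.0, eta_a_grid[:, None])   # (n_a, n_z)
    obs = eta_obs[None, :] * (1.0 + delta_grid[:, None])      # (n_d, n_z)
    resid = (obs[None, :, :] - model[:, None, :]) / sigma
    chi2 = np.sum(resid**2, axis=2)                           # (n_a, n_d)

    # Likelihood times the Gaussian prior on Delta0, summed over Delta0.
    log_like = -0.5 * chi2 - 0.5 * (delta_grid / sigma0) ** 2
    like_marg = np.exp(log_like - log_like.max()).sum(axis=1)

    # Marginalized chi2 and the Delta chi2 = 1 interval.
    with np.errstate(divide="ignore"):
        chi2_marg = -2.0 * np.log(like_marg)
    chi2_marg -= chi2_marg.min()
    inside = eta_a_grid[chi2_marg <= 1.0]
    return eta_a_grid[np.argmin(chi2_marg)], inside.min(), inside.max()
```

The two parameter case works the same way with one more grid axis for $\eta_0$; for larger problems a sampler would be a more natural choice than a brute force grid.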



A. Taking care of redshift mismatch 

Before discussing the results on $(\eta_0, \eta_a)$ from fitting the above dataset, it is worth
spending some time to explicitly show the impact of redshift mismatch and why we advise the
reader to avoid it by using the local regression technique (or any other reliable method to
estimate the LD at the same redshift as the cluster).

To this aim, we build up simulated cluster and SNeIa samples as close as possible to the actual
ones. First, we choose a fiducial cosmological model assuming a spatially flat universe with
matter density parameter $\Omega_M = 0.27$, constant dark energy equation of state $w = -0.95$
and Hubble constant (in units of 100 km/s/Mpc) $h = 0.703$, consistent with the recent WMAP7 [17]
results. We then choose the B06 sample as a reference case and assign to each cluster in this
sample an ADD equal to $k D_A(z)$, with $D_A(z)$ the theoretical value and $k$ randomly chosen in
the range (0.98, 1.02) to mimic a possible mismatch due to statistical and/or systematic errors.
To each value, we then attach a measurement uncertainty in such a way that the relative error
equals the one for the ADD of the cluster in the B06 sample having the same $z$. For the
simulated SNeIa sample, we adopt a similar procedure, the only difference being that we generate
the distance modulus (rather than the LD) from a Gaussian distribution centred on the theoretical
value and with variance $(\mu_{sim}/\mu_{obs})\,\sigma_{obs}$, but never smaller than
$\sigma_{int} = 0.15$, this value being the intrinsic scatter of the SNeIa peak magnitude. The
same scaling of the errors is then used to assign a statistical uncertainty to the simulated
$\mu(z)$ for each SN in the sample.
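
A minimal sketch of how such mock samples could be generated is given below. It is only an illustration under stated assumptions: the function names are hypothetical, distances are computed for the flat, constant $w$ fiducial model with a simple trapezoidal integration, and the scatter of the simulated distance moduli is evaluated with the theoretical $\mu$ in place of the simulated one (which is not yet known when the random draw is made).

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, omega_m=0.27, w=-0.95, h=0.703, n_steps=2001):
    """Comoving distance in Mpc for a flat universe with constant w."""
    zz = np.linspace(0.0, z, n_steps)
    e_z = np.sqrt(omega_m * (1 + zz) ** 3
                  + (1 - omega_m) * (1 + zz) ** (3 * (1 + w)))
    f = 1.0 / e_z
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(zz))  # trapezoidal rule
    return C_KM_S / (100.0 * h) * integral


def mock_cluster_add(z_cl, rel_err, rng):
    """Simulated ADDs: k * D_A(z) with k uniform in (0.98, 1.02)."""
    d_a = np.array([comoving_distance(z) / (1 + z) for z in z_cl])
    d_sim = rng.uniform(0.98, 1.02, size=d_a.size) * d_a
    # Keep the same relative error as the observed cluster at that redshift.
    return d_sim, np.asarray(rel_err, dtype=float) * d_sim


def mock_sn_mu(z_sn, mu_obs, sigma_obs, rng, sigma_int=0.15):
    """Simulated distance moduli with rescaled errors and a 0.15 mag floor."""
    mu_obs = np.asarray(mu_obs, dtype=float)
    sigma_obs = np.asarray(sigma_obs, dtype=float)
    d_l = np.array([comoving_distance(z) * (1 + z) for z in z_sn])
    mu_th = 5.0 * np.log10(d_l) + 25.0
    # Scatter: rescaled observed error, never below the intrinsic scatter.
    # mu_th stands in for mu_sim here (assumption of this sketch).
    scatter = np.maximum(mu_th / mu_obs * sigma_obs, sigma_int)
    mu_sim = rng.normal(mu_th, scatter)
    # The same scaling assigns the statistical uncertainty.
    return mu_sim, mu_sim / mu_obs * sigma_obs
```

A typical call would pass `rng = np.random.default_rng(seed)` together with the redshifts and errors of the observed catalogs, so that the mock samples inherit their redshift distribution and error budget.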

The simulated cluster and SNeIa datasets are then used to estimate $\eta(z)$ at the cluster
redshifts using two different ways to deal with the problem of redshift mismatch. First, we take
as LD at each $z_{cl}$ the error weighted average of the SNeIa with $|\Delta z| < 0.005$, thus
obtaining$^2$:

$\eta_a = -0.071 \pm 0.100$

when forcing $\eta_0 = 1$, and

$\eta_0 = 0.940 \pm 0.085$ , $\eta_a = 0.226 \pm 0.475$

for the two parameter case, with the reported errors referring to the 68% confidence range$^3$.
Such a test shows that, although the value $\eta_a = 0$ is well within the 68% confidence ranges,
the best fit value may be severely biased if one does not force $\eta_0 = 1$. Since it is
reasonable to expect that the error bars will shrink with future data, we can argue that
averaging over the SNeIa with $|\Delta z| < 0.005$ can introduce a systematic bias which is
larger than the statistical uncertainty.

Actually, averaging is only a zeroth order approximation, so that one can suppose that a linear
interpolation of the $D_L(z)$ values within this range works much better. Using this approach, we
find:

$\eta_a = -0.163 \pm 0.080$

for the one parameter model, and

$\eta_0 = 0.836 \pm 0.052$ , $\eta_a = 0.148 \pm 0.171$

when $\eta_0$ is left free. It is evident that the bias on $\eta_a$ is still present for the two
parameter model. Somewhat surprisingly, the linear interpolation method has worsened rather than
improved the situation. Actually, this is partly a consequence of the smaller number of clusters
used, which makes the fit more sensitive to deviations from the DDR ansatz because of statistical
fluctuations. Note that the dataset now only contains 28 clusters since, for ten of them, we have
too few points (fewer than four objects) in the $|\Delta z| < 0.005$ SNeIa subset to define a
reliable interpolation.

Finally, let us consider the results obtained using local regression to estimate $\mu(z)$ and
then $D_L(z)$ for each cluster in the simulated sample. Fitting the one parameter model, we get:

$\eta_a = -0.005 \pm 0.126$ ,

while, when $\eta_0$ is fitted too, we find:

$\eta_0 = 1.002 \pm 0.100$ , $\eta_a = -0.022 \pm 0.291$ .

Compared to the averaging method, we clearly see that the bias on $\eta_a$ is reduced for both
the one and two parameter models and, as a further positive outcome, we also get a median
$\eta_0$ value quite close to the input one. We can therefore safely conclude that the local
regression technique does not bias the constraints on $(\eta_0, \eta_a)$ and confidently advocate
its use to test the DDR avoiding any systematic error due to the redshift mismatch problem.

$^1$ We thank the anonymous referee for suggesting its inclusion.

$^2$ We discuss only the results obtained fitting the first $\eta(z)$ model in Eq.(3), but our
conclusions on the impact of redshift mismatch are qualitatively the same for the other
parametrization. Moreover, we report the values obtained by a single simulation, but we have
checked that they are qualitatively the same running $\sim 100$ realizations of the LD and ADD
datasets.

$^3$ Note that the marginalized distributions are very close to Gaussian, so that the 68%
confidence range may be taken as a $1\sigma$ error and the 95% CL obtained by doubling the
$1\sigma$ uncertainty. Hereafter, we will therefore report only this estimate of the $1\sigma$
error.

Sample   $\eta_a$ ($\eta_0 = 1$)   $(\eta_0, \eta_a)_{bf}$   $\eta_0$          $\eta_a$

B06      -0.331 ± 0.129            (0.899, -0.192)           0.915 ± 0.078     -0.195 ± 0.311
DeF05    -0.622 ± 0.232            (0.719, 0.280)            0.751 ± 0.091      0.292 ± 0.538

B06      -0.273 ± 0.125            (0.896, 0.150)            0.911 ± 0.067     -0.153 ± 0.223
DeF05    -0.530 ± 0.217            (0.727, 0.210)            0.758 ± 0.097      0.220 ± 0.419

TABLE I: Constraints on the DDR test quantity $\eta(z)$ after marginalizing over $\Delta_0$.
Columns are as follows: 1. cluster sample used; 2. median and 68% confidence range for $\eta_a$
forcing $\eta_0 = 1.0$; 3. best fit $(\eta_0, \eta_a)$ values for the two parameter model; 4., 5.
median and 68% confidence ranges for $(\eta_0, \eta_a)$. The upper (lower) half of the table
refers to the first (second) ansatz in Eq.(3).

B. Present day constraints 

Motivated by the above discussion, we now use the local regression technique to infer the LD of
the clusters in the B06 and DeF05 samples using the SNeIa Union2 sample as input. We then fit the
data thus obtained with the four models introduced above and summarize the results in Table I.
Not surprisingly, the confidence ranges are quite large, so that it is not statistically possible
to definitively conclude whether the DDR holds or not at any $z$. It is worth noting that a
qualitatively similar conclusion is also reached in previous works. Indeed, the constraints in
Table I are fully consistent with those in [?], although we remark that a straightforward
comparison should be avoided given the radically different approach to the redshift mismatch
problem. Moreover, we have also included the term $(1 + \Delta_0)$ in Eq.(4), which has the
double impact of introducing a degeneracy in the parameter space and enlarging the confidence
ranges.

It is worth investigating how the constraints depend on the assumed $\eta(z)$ parameterization.
Comparing the constraints on $\eta_a$ for both the one and two parameter models in the upper and
lower halves of Table I, we see that the logarithmic ansatz may be reconciled with the data only
if smaller $\eta_a$ values are used. This is an expected result considering that, for the same
$\eta_0$ value (as, e.g., for the one parameter case), a smaller $\eta_a$ partially compensates
for the different scalings with $z$ of the two cases considered. Although somewhat expected, this
result highlights the importance of choosing a reliable parameterization for $\eta(z)$ in order
to better check the DDR validity at any $z$. On the contrary, the functional expression adopted
for $\eta(z)$ has only a minor impact on the $\eta_0$ constraints. Indeed, for a fixed sample,
the 68% confidence ranges for the two $\eta(z)$ expressions largely overlap, so that one could
draw conclusions on $\eta_0$ in a roughly model independent way.

Table I shows that, actually, the largest impact on the constraints is due to the sample used,
that is to say to the assumptions on the cluster geometry. Indeed, both for models with
$\eta_0 = 1$ and for those with $\eta_0$ left free, the B06 sample gives values of $\eta_a$
closer to zero than the DeF05 one. Moreover, when $\eta_0$ is free to vary, the B06 sample
recovers $\eta_0 = 1$ within $2\sigma$, while a significantly smaller value,
$\eta_0 \simeq 0.76$, is obtained with the DeF05 sample, leading to $\eta_0 < 1$ at more than
$2.7\sigma$. Since the SNeIa companion sample used is the same, it is likely that the difference
has to be ascribed to how the ADD has been estimated from the cluster data. In particular, since
$\eta_0 < 1$ has been obtained, one should argue that the LD has been underestimated or the ADD
overestimated. Investigating this issue in detail is outside our aims. We only stress that the
uncertainty on the cluster geometry is unlikely to be reduced with improved observations, being
related to projection effects. As a consequence, this source of systematic error is hard to keep
fully under control even with future data.



IV. DDR VS FUTURE DATA 

FIG. 1: Simulated data for a Euclid-like mission. Left: SNeIa redshift distribution (normalized
so that the area under the histogram is 1). Centre: angular diameter distance data. Right:
inferred $\eta(z)$ using local regression and the simulated SNeIa.

In order to escape the uncertainties on the cluster geometry, one must rely on a different
tracer to estimate the ADD at a given redshift $z$. Baryon Acoustic Oscillations immediately
stand out as ideal candidates to this aim. Indeed, the precise determination of the galaxy power
spectrum as a function of both the radial and tangential components of the wave vector allows one
to constrain $D_A(z_{med})/r_s$ and $H(z_{med}) r_s$, $r_s$ being the sound horizon and $z_{med}$
the median redshift of the survey. Assuming that such a measurement is available, one can then
rewrite the DDR in terms of the scaled ADD $\tilde{D}_A(z) = D_A(z)/r_s$ as:

$$\eta(z) = \frac{D_L(z)(1+z)^{-2}}{D_A(z)} = \frac{D_L(z)(1+z)^{-2}}{r_s \tilde{D}_A(z)} .$$

This can be conveniently rewritten as

$$r_s \eta(z) = \frac{D_L(z)(1+z)^{-2}}{\tilde{D}_A(z)} \qquad (5)$$

so that the rhs only contains observable quantities, while the lhs is a function of the sound
horizon distance $r_s$ (which is a constant) and the parameters entering the adopted $\eta(z)$
phenomenological expression. Let us now suppose that a galaxy survey has the possibility to
determine the power spectrum in $M_{BAO}$ bins with sufficient accuracy to provide $M_{BAO}$
measured values of the scaled ADD $\tilde{D}_A(z)$, with $z$ a characteristic redshift of the bin
(e.g., the central or the median value). We can then resort to local regression on SNeIa to
estimate the LD at the sampled $z$ and then get a catalog of
$D_L(z)(1+z)^{-2}/\tilde{D}_A(z)$ measurements. This sample could then be fitted to an assumed
$\eta(z)$ model, but this only gives constraints on $r_s \eta_0$ and $r_s \eta_a$. However, the
sound horizon distance $r_s$ is well constrained by CMBR data in an (almost) model independent
way and with an error which can be as small as 0.1% according to what is forecast for Planck. We
can therefore assume that $r_s$ is known and directly use the ADD as
$D_A(z) = r_s \tilde{D}_A(z)$, so that the same fitting analysis used with cluster data can be
implemented for ADDs traced by the BAO.

The combination of BAOs to infer ADDs and local regression to estimate the LD at the same
redshift as the ADD allows us to get a set of measured values for $\eta(z)$ which is free from
the two most problematic systematic errors that can mimic a deviation from the DDR even if such a
violation of the Etherington reciprocity theorem should not be present at all. Unfortunately,
while the available SNeIa samples are numerous enough to allow a decent reconstruction of
$\mu(z)$ through the local regression method, present day BAO measurements only allow one to
constrain $r_s/D_V(z)$, with $D_V(z) = [cz(1+z)^2 D_A^2(z)/H(z)]^{1/3}$ the so-called volume
distance [?]. We therefore have to rely on future data to apply the test outlined above. Note
that future data will also help in improving the LD estimates from SNeIa. Indeed, upcoming SNeIa
surveys will both increase the statistics and offer a better control of the systematics, so that
we can reduce the errors on the reconstructed LD thanks to both a larger subsample for each $z$
and the rejection of outliers.


The simulated dataset 



In order to investigate the potential of combined BAO + SNeIa data to constrain the DDR, we rely
here on simulated data assuming a Euclid-like survey. Euclid [18] is a candidate ESA mission to
map the geometry and the evolution of the dark universe to an unprecedented precision, setting
high accuracy constraints on dark matter, dark energy and modified gravity. To this end, two
independent cosmological probes will be used, namely weak gravitational lensing and BAO,
measuring the shapes and the spectra of galaxies over $\sim 15000$ deg$^2$ of extragalactic sky
in both visible (down to $\sim 24.5$ AB mag in the visible wide R+I+Z filter) and NIR (up to 24
mag in the Y, J, H filters), up to redshift $z \sim 2$. A deep survey (two magnitudes deeper than
the wide one) over a 40 deg$^2$ area will also be conducted for legacy science and could offer
the possibility to detect $\sim 3000$ SNeIa. The possibility to both measure BAO and build up a
SNeIa catalog makes Euclid an ideal tool to provide all the ingredients we need to check the
validity of the DDR, so that we use this mission as a test case for our proposed method, assuming
the fiducial cosmological model described in Section III A.

1. SNeIa data

Let us briefly describe how we simulate the SNeIa sample$^4$. As a first step, we choose
template light curves for each SN type (not only SNeIa, but also IIP, IIL, IIn and Ibc) as well
as SN rates as a function of redshift. Starting from the results of the LOSS survey for the peak
magnitude and the Gaussian SN magnitude distribution, Monte Carlo simulations are then performed
generating artificial SNe (with expected total counts computed from the above template) with
random redshifts and explosion epochs. Depending on the survey strategy, one can then compute the
total number of SNe of each type which are detected at least once and then impose some cuts on
the number of epochs at which each SN is detected. Such cuts finally give the number of SNeIa
which could be used for cosmology (i.e., those that have a sufficiently well sampled lightcurve
to determine their distance modulus) and their redshift distribution. To this end, we assume an
observational strategy consisting of a first two month phase spent monitoring a 20 deg$^2$ field
to a depth of 24.4 mag at a 4 day cadence. This is immediately followed by a period with a 10 day
cadence for 15 epochs to a depth of 24 mag. This same setup is then repeated over a second 20
deg$^2$ patch of the sky, thus finally giving us a sample of 3053 SNeIa with $0.03 < z < 1.37$
and $z_{med} = 0.78$, with the redshift distribution plotted in the left panel of Fig. 1. It is
worth noting that the actual strategy that will be implemented by Euclid has still to be decided.
We nevertheless stress that the expected SNeIa number is the same as what we are getting here, so
that we can confidently rely on our simulated dataset as a first guess of a Euclid-like catalog.
For each SN in the sample, we estimate the error on the distance modulus as



$$\sigma_\mu(z) = \sqrt{\sigma_{\rm sys}^2 + (z/z_{\rm max})^2 \sigma_m^2}$$



with $z_{\rm max}$ the maximum redshift of the sample, $\sigma_{\rm sys}$
an irreducible scatter and $\sigma_m$ depending on the
photometric accuracy. Although these numbers have still to
be computed for the Euclid SN survey$^{5}$, we set here



4 The code we used was developed to investigate how many
SNe can be detected by a Euclid-like survey regardless of
their type. As such, although we are only interested in SNeIa,
we also obtain the core collapse SNe at no extra cost.

5 A different and more detailed strategy to forecast the precision
on the distance modulus determination from the SN lightcurve
has been described in [21]. We have preferred not to use that
method since it involves many further unknown parameters
(such as the SALT2 color correction terms), thus introducing a
degree of arbitrariness in the simulations that we prefer to avoid.
It is worth noting, however, that they use a smaller value of
$\sigma_{\rm int}$, so that their uncertainties are likely smaller than
ours. As such, should their method turn out to be more reliable
than our phenomenological formula, the results presented here
would overestimate the impact of uncertainties, thus leading to a
conservative estimate of the final constraints on the DDR quantities.



$(z_{\rm max}, \sigma_{\rm sys}, \sigma_m) = (1.4, 0.15, 0.02)$, mimicking a typical
space based survey. Denoting with $\mu_{\rm fid}(z)$ the value
predicted by our fiducial cosmological model, we then assign
to each SN a distance modulus randomly generated from a
Gaussian distribution centred on $\mu_{\rm fid}(z)$ with dispersion
$\sigma_\mu(z)$ given by the expression above. The measurement error
is finally set to $\sigma_{\rm obs}(z) = [\sigma_\mu(z)/\mu_{\rm fid}(z)]\,\mu_{\rm obs}(z)$,
thus obtaining the simulated SNeIa dataset we need as
input to the local regression technique.
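The recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flat $\Lambda$CDM parameters in `mu_fid` are placeholders for the fiducial model of Section II, the uniform redshift draw stands in for the simulated redshift distribution, and $\sigma_\mu(z)$ is interpreted as the standard deviation of the Gaussian. All function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def mu_fid(z, omega_m=0.27, h=0.70, n_steps=2000):
    """Distance modulus for a flat LCDM model (placeholder fiducial values)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    mu = np.empty_like(z)
    for i, zi in enumerate(z):
        zz = np.linspace(0.0, zi, n_steps)
        inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)
        # trapezoidal rule for the dimensionless comoving distance
        integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
        d_l = (1.0 + zi) * C_KM_S / (100.0 * h) * integral  # Mpc
        mu[i] = 5.0 * np.log10(d_l) + 25.0
    return mu

def sigma_mu(z, z_max=1.4, sigma_sys=0.15, sigma_m=0.02):
    """Phenomenological error on the distance modulus quoted in the text."""
    return np.sqrt(sigma_sys ** 2 + (z / z_max) ** 2 * sigma_m ** 2)

def simulate_sne(z, rng):
    """Draw mu_obs around mu_fid and rescale the error by mu_obs / mu_fid."""
    mu_f = mu_fid(z)
    sig = sigma_mu(z)
    mu_obs = rng.normal(mu_f, sig)   # sigma_mu used as standard deviation (assumption)
    sig_obs = sig / mu_f * mu_obs
    return mu_obs, sig_obs

rng = np.random.default_rng(42)
# Placeholder redshifts: uniform in the quoted range, NOT the simulated N(z)
z_sn = rng.uniform(0.03, 1.37, size=3053)
mu_obs, sig_obs = simulate_sne(z_sn, rng)
```

With the quoted values, the error budget is dominated by $\sigma_{\rm sys}$ at all redshifts, since the photometric term contributes at most 0.02 mag at $z = z_{\rm max}$.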



2. BAO data 

We now discuss the simulated ADD measurements
from BAO. We use the method developed and tested in
[22] to forecast the percentage error on $D_A(z)/r_s$ from
a BAO survey as a function of both the fiducial
cosmological model and the survey characteristics. It is
worth first recalling that Euclid will perform slitless
spectroscopy for galaxies with an H$\alpha$ flux down to
$f \sim 4 \times 10^{-16}$ erg s$^{-1}$ cm$^{-2}$, so that its main
targets will be star forming galaxies. This information is
needed to estimate both the redshift distribution of
detectable sources and the linear bias to be applied to
match the matter and galaxy power spectra. Following
[23], we assume a 20000 deg$^2$ survey over the redshift
range 0.5 < z < 2, with dN/dz obtained by multiplying
the one in [24] by a success rate $\epsilon = 0.35$ as a
conservative choice, while the linear bias varies with
redshift according to the model in [25]. Differently from
[23], we consider 16 equally spaced redshift bins with bin
width $\Delta z = 0.1$ in order to increase the number of
$D_A(z)$ measurements, but we stress that we can actually
estimate $\eta(z)$ only for the first 9 bins, since the SNeIa
sample does not extend to z > 1.3, so that no LD
determination is available for the higher redshift bins.

Two further ingredients are needed before using the
[22] code. First, one has to set the spectral index of
scalar perturbations, denoted as $n_s$, and the variance of
density perturbations in a sphere of radius $8 h^{-1}$ Mpc,
usually referred to as $\sigma_8$. In agreement with the WMAP7
results, we choose $(n_s, \sigma_8) = (0.96, 0.809)$. Finally, in
order to avoid the problem of modeling nonlinear effects,
we cut the power spectrum at a maximum wavenumber
$k_{\rm max}$ determined by solving $\sigma^2(1/k_{\rm max}, z) = 0.25$,
with $\sigma^2(R, z)$ the variance over the scale $R = 1/k$ for the
power spectrum at redshift z. Note that this leads to a
redshift dependent upper limit on the usable power
spectrum, although a conservative good approximation is to
set $k_{\rm max} \simeq 0.15 h$ Mpc$^{-1}$ independent of z.

The code then outputs $\sigma_{s_\perp}$, i.e., the error on
$\ln[D_A(z)/r_s]$, so that, if we assume that the error on $r_s$
is negligible, we simply get $\sigma_{D_A}/D_A = \sigma_{s_\perp}$. As a
simplifying (yet realistic) assumption, we associate this
error with the ADD measurement at the centre of the
redshift bin. We then generate $D_A(z)$ from a Gaussian
distribution centred on the fiducial ADD with dispersion
equal to the error output by the code, and finally scale
the measurement error according to the ratio between the
simulated and fiducial distance. The data thus generated
are shown in the central panel of Fig.[TJ while the right 
panel plots the inferred $\eta(z)$ measurements using the
local regression technique to estimate the LD for the BAO
ADD measurements (up to z = 1.3).
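How the two simulated datasets combine into $\eta(z)$ estimates can be illustrated with the following sketch. It is not the implementation used for the results quoted here: the local regression is a generic first-order fit with a tricube kernel applied to the distance moduli, the window fraction `frac`, the flat 5% error on $D_A$, the constant 0.15 mag SN error, the uniform SN redshifts and the flat $\Lambda$CDM parameters are all illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def lum_dist(z, omega_m=0.27, h=0.70, n_steps=2000):
    """Luminosity distance in Mpc for a flat LCDM model (placeholder values)."""
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
    return (1.0 + z) * 2997.92458 / h * integral

def local_linear(x0, x, y, sigma_y, frac=0.1):
    """Local first-order fit at x0 using the nearest fraction of the data."""
    n_near = max(3, int(np.ceil(frac * x.size)))
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:n_near]
    u = dist[idx] / dist[idx].max()
    w = (1.0 - u ** 3) ** 3 / sigma_y[idx] ** 2   # tricube kernel x inverse variance
    # design matrix centred on x0: the intercept is the estimate at x0
    a = np.column_stack([np.ones(n_near), x[idx] - x0])
    m = np.linalg.inv(a.T @ (w[:, None] * a)) @ (a.T * w)
    beta = m @ y[idx]
    cov = (m * sigma_y[idx] ** 2) @ m.T           # propagated data errors
    return beta[0], np.sqrt(cov[0, 0])

rng = np.random.default_rng(1)

# Toy SNeIa Hubble diagram (uniform redshifts and constant error are placeholders)
z_sn = rng.uniform(0.03, 1.37, size=3000)
sig_sn = np.full(z_sn.size, 0.15)
mu_sn = rng.normal(5.0 * np.log10([lum_dist(z) for z in z_sn]) + 25.0, sig_sn)

# Toy BAO ADDs at the centres of the first 9 bins, with an assumed 5% error
z_bao = 0.5 + 0.1 * np.arange(9)
da_fid = np.array([lum_dist(z) for z in z_bao]) / (1.0 + z_bao) ** 2
da_obs = rng.normal(da_fid, 0.05 * da_fid)
da_err = 0.05 * da_obs   # fiducial error rescaled by da_obs / da_fid

eta, eta_err = [], []
for z, da, s_da in zip(z_bao, da_obs, da_err):
    mu_hat, mu_err = local_linear(z, z_sn, mu_sn, sig_sn)
    dl = 10.0 ** ((mu_hat - 25.0) / 5.0)
    # relative errors added in quadrature (independent errors assumed)
    rel = np.hypot(np.log(10.0) / 5.0 * mu_err, s_da / da)
    e = dl / ((1.0 + z) ** 2 * da)
    eta.append(e)
    eta_err.append(e * rel)
eta, eta_err = np.array(eta), np.array(eta_err)
```

Evaluating the regression directly at each BAO redshift is what removes the redshift mismatch between the two tracers; in this toy setup the error on $\eta$ is driven almost entirely by the assumed ADD uncertainty.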

B. Constraints on DDR parameters 

The simulated datasets above are input to the same
fitting procedure we have used for the present day
data. We start by discussing the results for one
representative realization of the SNeIa and BAO data. For the
one parameter models (i.e., with $\eta_0 = 1$), we get$^{6}$:

$\eta_a = 0.001 \pm 0.067$
for the first case in Eq.(3), and

$\eta_a = 0.001 \pm 0.047$

for the second ansatz. A comparison with the results
for the simulated case using local regression discussed
at the end of Section III A shows that, although we now
use a smaller dataset (only 9 instead of 38 points), the
errors on $\eta_a$ have been reduced by a factor two. Such a
large reduction is a consequence of two effects. On one
hand, the tenfold increase in the size of the SNeIa sample
puts more points in each of the local bins used to fit the
low order polynomial of the local regression method, thus
reducing the error on $D_L(z)$. On the other hand, BAO data
allow us to measure $D_A(z)$ with an accuracy of order 5%,
so that the final uncertainty on $\eta(z)$ is quite small. As a
result, the lower statistics offered by this method is more
than compensated by the far better precision, thus
shrinking the $\eta_a$ confidence ranges.
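The interplay between the two effects can be made explicit with a first-order error propagation. This is only a sketch under the assumption, not stated in the text, that the errors on $D_L$ and $D_A$ are uncorrelated; it starts from the DDR definition $\eta(z) = D_L(z)/[(1+z)^2 D_A(z)]$:

```latex
\left(\frac{\sigma_\eta}{\eta}\right)^2 \simeq
\left(\frac{\sigma_{D_L}}{D_L}\right)^2 +
\left(\frac{\sigma_{D_A}}{D_A}\right)^2 ,
\qquad
\frac{\sigma_{D_L}}{D_L} = \frac{\ln 10}{5}\,\sigma_\mu .
```

Under this assumption, a larger SNeIa sample lowers the first term through the regression error on the distance modulus, while the BAO precision sets the second term at the level of the quoted 5%.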
When $\eta_0$ is left free, we find:

$\eta_0 = 0.994 \pm 0.180$ , $\eta_a = -0.016 \pm 0.321$ ,

$\eta_0 = 0.940 \pm 0.175$ , $\eta_a = -0.009 \pm 0.173$ ,

for the two models in Eq.Q. Compared to the present 
day simulated data, we now find that the constraints on
$\eta_0$ are actually poorer, while the opposite holds
for the $\eta_a$ parameters, whose confidence



6 Since we are dealing with simulated datasets, the best fit values
have no particular meaning, so we could have reported only the
$1\sigma$ uncertainties. We have nevertheless preferred to also give the
best fit $(\eta_0, \eta_a)$ in order to show that no bias is induced by
the simulations and the fitting method.



ranges are smaller. While the first result is a consequence
of the lower statistics, which is no longer compensated by
the increased precision because there are now two
parameters to fit, the improvement in the $\eta_a$ constraints is
related to the larger redshift range probed by the BAO
data. It is, however, worth stressing that the statistical
uncertainties on $(\eta_0, \eta_a)$ coming out of the fit are
actually not the only source of error. As we have seen when
fitting the present day data, systematic errors can be
larger than the statistical ones and bias the inferred best
fit values. The BAO method is free from this problem and
should therefore be preferred over clusters as an ADD
tracer.

Finally, we check whether the method used is able to
recover the input parameters. To this end, we have run
~ 100 realizations of the SNeIa and BAO future data and
repeated the fit for each of them. For the one parameter
models, averaging the median $\eta_a$ over the full set of
simulations, we find:

$\langle\eta_a\rangle = -0.001 \pm 0.004$ , $\langle\eta_a\rangle = -0.001 \pm 0.003$ ,

for the linear and logarithmic $\eta(z)$ ansatz, respectively,
where the error is the standard deviation of the
(approximately) Gaussian distribution of the results.
Leaving $\eta_0$ as a free parameter, we get:

$\langle\eta_0\rangle = 0.98 \pm 0.02$ , $\langle\eta_a\rangle = -0.01 \pm 0.03$ ,
for the linear model, and

$\langle\eta_0\rangle = 0.94 \pm 0.01$ , $\langle\eta_a\rangle = 0.00 \pm 0.02$ ,

for the logarithmic one. Such results suggest that the
median $\eta_a$ value output by the fit is on average
consistent with the input one for both the linear and
logarithmic model, independently of the use of the $\eta_0 = 1$
assumption. On the contrary, $\eta_0$ is less well recovered
because of the degeneracy with the nuisance A parameter.
Although this could add a note of caution in using
the proposed method, it is nevertheless worth stressing
that, for all the simulations, the statistical error on $\eta_0$ is
roughly the same as the one reported above for the
representative case. As a consequence, the value $\eta_0 = 1$ is
always well within the $1\sigma$ error, so that we conclude that
the bias is not statistically meaningful.
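The logic of this recovery test can be illustrated with a toy Monte Carlo. The sketch below is not the fitting procedure used above: it swaps in a closed-form weighted least squares fit of the linear ansatz $\eta(z) = \eta_0 + \eta_a z$ with $\eta_0 = 1$ fixed, and it assumes 9 points at the BAO bin centres with a flat 5% error each, so the numbers are not meant to reproduce the values quoted in the text. The function name is hypothetical.

```python
import numpy as np

def fit_eta_a(z, eta, sigma, eta0=1.0):
    """Weighted least squares for eta(z) = eta0 + eta_a * z with eta0 fixed."""
    w = 1.0 / sigma ** 2
    eta_a = np.sum(w * z * (eta - eta0)) / np.sum(w * z ** 2)
    err = 1.0 / np.sqrt(np.sum(w * z ** 2))
    return eta_a, err

rng = np.random.default_rng(0)
z = 0.5 + 0.1 * np.arange(9)          # bin centres with an LD estimate
sigma = np.full(z.size, 0.05)         # assumed error on each eta point
n_real = 100                          # number of simulated realizations

# Input model: the DDR holds exactly, i.e. eta(z) = 1 and eta_a = 0
fits = np.array([fit_eta_a(z, rng.normal(1.0, sigma), sigma)[0]
                 for _ in range(n_real)])

mean_eta_a = fits.mean()              # should scatter around the input value 0
scatter = fits.std(ddof=1)            # spread of the best fits over realizations
single_err = fit_eta_a(z, np.ones_like(z), sigma)[1]
```

In this toy setup the mean of the best fits is compatible with the input value and the spread over realizations is close to the formal error of a single fit, which is the kind of consistency the test is meant to verify.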

V. CONCLUSIONS 

It is common to say that we are living in the era of
precision cosmology. While this is only partly true today,
one can be confident that future data will usher in an
epoch where we can not only improve the precision of the
constraints on a given cosmological model, but also test
the cornerstones of observational cosmology. Although its
importance is usually underrated, the Etherington
reciprocity law stands out as one of the fundamental
pillars our interpretation of astrophysical data is based
on. Upcoming surveys will have sufficient quality to
promote the distance duality relation (which is the most
used incarnation of the Etherington law) from an a priori
theoretical assumption to the rank of a relation which can
be observationally validated.

In order to test the validity of the DDR, one needs
to trace both the luminosity and angular diameter
distance for a set of redshift values. We have here followed
the usual approach relying on cluster data to estimate
the ADD and on SNeIa as LD tracer. We have, however,
improved the standard analysis by introducing the local
regression technique to avoid the redshift mismatch
problem (i.e., the difference between the redshift of the
cluster and those of the SNeIa used to infer the
corresponding LD). This simple and widely tested method
strongly reduces the bias on the $\eta(z)$ parameters, thus
increasing the reliability of the constraints and hence of
the test of the DDR validity. Unfortunately, the poor quality
of the cluster ADD determinations still leads to large
confidence ranges, which prevents drawing any statistically
meaningful conclusion on the violation of the DDR over the
redshift range probed by the available data. Moreover,
the results strongly depend on the assumptions on the
cluster geometry, so that one should first find a method
to correct for this effect or propagate this uncertainty to
the final error on the $(\eta_0, \eta_a)$ parameters introduced
to quantitatively check the DDR validity.

In an attempt to escape this problem, we have here
proposed to use BAO as an alternative ADD tracer. Since
the physics of BAO is well understood, the systematics
connected with this method can be easily quantified
and satisfactorily corrected for with future galaxy survey
data. Since the present day data are too poor to
implement this test in an efficient way, we have relied
on simulated samples of both BAO ADD measurements
and SNeIa distance modulus determinations, considering a
fiducial Euclid mission as the source of both datasets. Such
an analysis has highlighted the virtues of the proposed
approach, showing that the error bars are halved if one
forces $\eta_0 = 1$. When this assumption is abandoned, we
find only a modest decrease of the relative uncertainty on
$\eta_0$ with respect to present day data, but the constraints
on $\eta_a$ are still strengthened by a factor two. Moreover,
the lack of systematic errors makes this approach highly
preferable over the use of cluster data as ADD tracers.

It is worth noting that the proposed approach does not
exploit the full potential of BAO. Indeed, while BAO
allow us to determine the ADD up to redshift z = 2, the
quantity $\eta(z)$ can only be determined up to z = 1.4,
the latter being the maximum redshift probed
by the SNeIa Hubble diagram. In order to push this limit
further, one could rely on a different SNeIa survey able
to detect a statistically meaningful number of objects at
z > 1.4 with sufficient precision. As an alternative
approach, one could look for a different LD tracer. Gamma
ray bursts (GRBs) stand out as ideal candidates from
this point of view since they can be detected up to z ~ 8
[2r| thanks to the huge energy released during the
explosion. Unfortunately, the use of GRBs as standardizable
candles is still in its infancy so that, notwithstanding the
first released GRB Hubble diagrams [HI, HtJ , a careful
analysis of the systematics has still to be fully done (but
see, e.g., [28] for recent encouraging results). Should
future data validate GRBs as LD tracers, one could use
them as input to the local regression technique and trace
$\eta(z)$ over the full redshift range probed by BAO surveys.

As a final remark, it is worth noting that the proposed
method will not only allow us to check the foundations of
observational cosmology by giving an empirical
validation of the universally assumed Etherington law, but
also open the way to completely new physics should this
test find a statistically meaningful violation of the DDR.